Parallel Spectral Clustering Algorithm Based on Hadoop

نویسندگان

  • Yajun Cui
  • Yang Zhao
  • Kafei Xiao
  • Chenglong Zhang
  • Lei Wang
چکیده

Spectral clustering and cloud computing is emerging branch of computer science or related discipline. It overcome the shortcomings of some traditional clustering algorithm and guarantee the convergence to the optimal solution, thus have to the widespread attention. This article first introduced the parallel spectral clustering algorithm research background and significance, and then to Hadoop the cloud computing Framework has carried on the detailed introduction, then has carried on the related to spectral clustering is introduced, then introduces the spectral clustering arithmetic Method of parallel and relevant steps, finally made the related experiments, and the experiment are summarized.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop

Document clustering is one of the important areas in data mining. Hadoop is being used by the Yahoo, Google, Face book and Twitter business companies for implementing real time applications. Email, social media blog, movie review comments, books are used for document clustering. This paper focuses on the document clustering using Hadoop. Hadoop is the new technology used for parallel computing ...

متن کامل

Ecology Parallel implementation of K-Means clustering algorithm based on mapReduce computing model of hadoop

In recent years, data clustering has been studied extensively and a lot of methods and theories have been achieved. However, with the development of the database and the popularity of Internet, a lot of new challenges such as Big Data and Cloud Computing lie in the research on data clustering. The paper presents a parallel k-means clustering algorithm based on MapReduce computing model of Hadoo...

متن کامل

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

Hadoop MapReduce framework is an important distributed processing model for large-scale data intensive applications. The current Hadoop and the existing Hadoop distributed file system’s rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and a same workload is assigned to each node. Default Hadoop d...

متن کامل

Parallel and Scalabale Rules Based Classifier Using Map-reduce Paradigm on Hadoop Cloud

The huge amount of data being generated by today’s data acquisition and processing technologies. Extracting hidden information is become practically impossible from such huge datasets, even then there are several data mining tasks like classification, association rule, clustering, etc. are used for information extractions. Data mining task, classification, consists of identifying a class to a s...

متن کامل

Parallel Power Iteration Clustering for Big Data using MapReduce in Hadoop

In today’s life Distributed Data Mining is most popular topic in research area because as data are increasing in day to day life there are so many problems occurs to handle them and there are also a solutions for that but still they are not as per expectation, still there are some issue already there in the Distributed Data Mining, among them mainly we are focus in this papers that about reduci...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1506.00227  شماره 

صفحات  -

تاریخ انتشار 2015